1. Introduction


Detection of people is one of the important tasks in intelligence, surveillance, and reconnaissance (ISR). For example, in perimeter protection, one would like to detect any intruders during day and night so that the proper authorities can be alerted for appropriate action. In urban operations, one would like to make sure that nobody enters a building once it has been evacuated; this implies that sensors should detect people entering the building. Homeland Security
often  requires  detection  of  illegal  aliens  crossing  the  border.  There  are  numerous  other 
applications  where  personnel  detection  is  important. 

Detection  of  people  is  a  challenging  problem.  For  example,  acoustic  sensors  may  analyze  the 
sound  and  determine  if  there  is  any  human  voice  present.  However,  if  the  people  are  not  talking, 
the  acoustic  sensors  may  not  be  able  to  detect  people  based  on  the  voice  analysis.  So,  other 
sensors  such  as  seismic,  passive  infrared  (PIR),  sonar,  ultrasound,  radar,  magnetic,  and  electric 
field  (E-field)  sensors  should  be  used  for  detection  of  people,  since  no  single  sensor  will  be  able 
to detect in every situation and circumstance. Notice that the emphasis is on non-imaging sensors, since they tend to be low power and long lasting. Video sensors often consume considerable power and require frequent replacement of batteries; hence, there is a higher chance of compromising the mission. As such, the sensors used should consume little power and last long on batteries. For these reasons, the majority of the sensors used for personnel detection tend to be acoustic, seismic, PIR, magnetic, and E-field sensors, to name a few. However, when one is collecting the data for development of algorithms, truth data are vital. For this purpose, we use video cameras to capture the truth data.

The organizations that participated in this data collection are listed below:

1.  U.S.  Army  Research  Laboratory  (ARL),  Adelphi,  MD,  USA 

2.  Night  Vision  and  Electronic  Sensors  Directorate  (NVESD),  Ft.  Belvoir,  VA,  USA 

3.  Space  &  Naval  Warfare  Systems  Command  (SPAWAR),  San  Diego,  CA,  USA 

4.  University  of  Mississippi,  Oxford,  MS,  USA 

5.  University  of  Memphis,  Memphis,  TN,  USA 

6.  Canadian  Defense  Organization 

7.  Israeli  Team 

8.  Finnish  Defense  Organization 

We  now  present  some  descriptions  of  the  equipment  used  by  various  parties. 
1.1 ARL Data Collection System

ARL brought three sensor systems, which were used to collect the data for the choreographed scenarios. It also brought another three sensor systems to collect data on animals in their natural habitat; these data were collected day and night.

One of the commercially available data collection systems, namely, “Wavebook,” is primarily used for data collection for choreographed scenarios where people and animals walked along the trails in an orderly fashion. It has eight channels for data acquisition. The sampling rate and the anti-aliasing filters can be preprogrammed as desired. The following sensors are used on the Wavebook:

•  Acoustic,  seismic,  PIR  and  ultrasonic  sensor  suite 

An  automatic  data  collection  unit  (ADCU)  is  used  at  remote  sites  to  collect  the  data  around  the 
clock  for  animals  in  their  natural  habitat.  The  system  is  capable  of  collecting  data  on  eight 
channels  at  4  k  samples  per  second.  The  sensor  suite  consists  of  the  following: 

•  Acoustic,  seismic,  PIR  and  ultrasonic  sensor  suite 

The sensors, namely, acoustic, seismic, PIR, and ultrasonic sensors, used for data collection are the same as the ones used during the 2009 data collection effort and thus have been described previously (36). The Wavebook data collection system with the sensor suite deployed near the trail is shown in figures 1 and 2, the latter of which shows one of the scenarios being enacted during the data collection. Figure 2 clearly shows several people and animals walking the trail.


Figure 1. (a) ARL sensor deployment and (b) new profiling sensor.

Figure 2. People and animals walking on the trail.


1.2  Night  Vision  Data  Collection  System 

NVESD  brought  a  high-resolution  camera  to  the  field  to  collect  data.  The  camera  system  is 
shown  in  figure  3.  The  system  included  a  fish-eye  lens  to  see  the  targets  coming  from  all  around. 


Figure  3.  NVESD  camera  system. 

1.3 SPAWAR Data Collection System

SPAWAR brought two magnetic sensors and deployed them along the trail. The sensors are sensitive enough to detect people passing by carrying some ferrous material (e.g., keys). The sensors are shown in figure 4.

Figure 4. SPAWAR’s magnetic sensors.


1.4  University  of  Mississippi  Data  Collection  System 

The University of Mississippi deployed a system similar to ARL’s system, with acoustic, seismic, and ultrasonic sensors. The deployed system is shown in figure 5.


Figure  5.  University  of  Mississippi  sensors. 

1.5  University  of  Memphis  Data  Collection  System 

The University of Memphis has been working on developing a new sensor system to replace high-power-consuming, high-bandwidth imaging sensors such as cameras. They developed a pyroelectric array profiling sensor with fewer pixels to capture the essence of an image. The deployed sensor is shown in figure 6.

Figure 6. Profiling sensor by University of Memphis.

1.6  Defense  Research  and  Development  Canada  SASNet 

The  Canadian  Defense  Organization  has  developed  a  low-cost  network  called  Self-healing 
Autonomous  Sensor  Networks  (SASNet)  for  detecting  and  tracking  targets.  Each  sensor  node  in 
the  SASNet  consists  of  acoustic,  seismic,  and  magnetic  sensors.  Figure  7  shows  some  elements 
of  SASNet. 


Figure  7.  SASNet. 


1.7  Israeli  Team 

The  Israeli  team  has  deployed  their  system  called  “Pearls  of  Wisdom,”  which  consists  of 
acoustic,  seismic,  magnetic,  and  imaging  sensors.  Their  system  is  shown  in  figure  8. 
Figure 8. Pearls of Wisdom system.


2.  Signal  Processing 


In  this  section,  we  present  some  of  the  advances  made  in  seismic  and  ultrasonic  signal  processing 
for  personnel  detection.  Most  of  the  advances  in  signal  processing  concentrated  on  acoustic 
(ultrasound)  and  seismic  sensors  as  they  offer  high  fidelity  to  distinguish  people  from  animals. 

2.1  Seismic  Signal  Processing 

The main purpose of a seismic sensor is to detect footfalls of humans walking within the receptive field of the sensor. There is a considerable amount of literature (1-6) on footstep detection. Traditionally, researchers have focused on estimating the cadence. However, if multiple people are walking in the vicinity of the sensor, it is difficult to estimate the cadence of an individual person. Moreover, if there are animals, it is difficult to differentiate multiple people walking from animals walking by observing the footfalls. However, multiple footfalls superimpose on one another, resulting in a frequency of ‘c’ Hz (where ‘c’ is an effective cadence of multiple walkers). So, a seismic algorithm can look for harmonics of cadence or several strong frequency components between 2 and 15 Hz to distinguish single and multiple walkers.

The seismic algorithm used is a multivariate Gaussian classifier (1-7) with the feature set consisting of the amplitudes of the frequency bins from 2 to 15 Hz. Then, an algorithm is used to estimate the posterior probability that footsteps are present.
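As an illustration only (not the implementation used in this work), the following Python sketch extracts the amplitudes of the frequency bins from 2 to 15 Hz and applies a two-class multivariate Gaussian classifier with Bayes' rule. The Hann window, the covariance regularization term, the equal class priors, and all function and class names are assumptions introduced for this example.

```python
import numpy as np

def band_features(x, fs, f_lo=2.0, f_hi=15.0):
    """Amplitudes of the FFT bins between f_lo and f_hi (Hz) of one data window."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                                   # remove the DC offset
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    keep = (freqs >= f_lo) & (freqs <= f_hi)
    return spec[keep]

class GaussianClassifier:
    """One multivariate Gaussian per class; equal priors are assumed."""

    def fit(self, feats, labels, reg=1e-3):
        feats = np.asarray(feats, dtype=float)
        labels = np.asarray(labels)
        self.classes = np.unique(labels)
        self.params = {}
        for c in self.classes:
            f = feats[labels == c]
            mean = f.mean(axis=0)
            # reg keeps the covariance invertible (an assumption of this sketch)
            cov = np.cov(f, rowvar=False) + reg * np.eye(f.shape[1])
            log_det = np.linalg.slogdet(cov)[1]
            self.params[c] = (mean, np.linalg.inv(cov), log_det)
        return self

    def posterior(self, feat):
        """Posterior probability of each class, ordered as in self.classes."""
        log_like = []
        for c in self.classes:
            mean, inv_cov, log_det = self.params[c]
            d = feat - mean
            log_like.append(-0.5 * (d @ inv_cov @ d + log_det))
        log_like = np.array(log_like)
        p = np.exp(log_like - log_like.max())          # stable normalization
        return p / p.sum()
```

With labels 0 (no footsteps) and 1 (footsteps), `posterior(feat)[1]` plays the role of the posterior probability that footsteps are present.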

The  algorithm  only  determines  whether  there  are  footsteps  present.  In  order  to  detect  the 
presence  of  humans,  it  is  necessary  to  determine  whether  these  footsteps  belong  to  a  human  or  an 
animal.  For  this,  we  invariably  turn  to  acoustics.  If  there  is  voice,  it  can  be  detected  and 
identified  as  a  human  voice  based  on  the  formants.  In  order  to  distinguish  people  and  animals 
when no voice is present, we analyze the sound generated by the animals walking. When a single hoof of a horse strikes the ground, it produces a sound pattern that is distinct from that of a human foot. Figure 9 shows the signature of a horse walking (for a period of 6 s) before and after noise removal. The noise removal is performed using empirical mode decomposition (4, 6) of the original signal into various component signals. From figure 9, it is clear that there are three peaks uniformly distributed in each time interval of 1 s. This indicates the cadence of the horse to be approximately 2.8 to 3 Hz. Since the cadence of a person is around 1.5 to 2 Hz, one can infer the presence of animals.


When a person walks, the heel of the foot strikes first and then the toe end of the foot strikes, rubbing against the ground, creating a unique seismic signature compared to that of an animal (figure 10). Animals, in general, walk on their hoof or “toe” (the horse ankle and heel or fetlock do not strike the ground), which strikes the ground, producing signatures that are different from those of people. The two signatures have different frequency responses on the same ground. In section 2.1.1, we present a technique that uses the differences in frequency responses to distinguish both types of footfalls.


Figure 10. Seismic signature of a single footstep of a person. (Figure labels: single footstep; impulse; friction; front leg; back leg; T2 ~100-150 ms.)


2.1.1  Discrimination  of  Animal  and  Human  Seismic  Signatures 

For a single walking person, detection of cadence and the human footstep signature is relatively easy. However, when animals and people are walking in the vicinity of a seismic sensor, the detection of the human footstep signature is not as straightforward. If there are multiple sensors collecting the same signatures, one can use principal component analysis (PCA) or independent component analysis (ICA) (9) to separate the human footstep signatures from the animal footsteps, depending on whether the noise is Gaussian or not. Most unattended ground sensor (UGS) systems consist of only one sensor per modality, that is, one acoustic, one seismic, etc., so it is not possible to use the PCA or ICA technique for blind signal separation, since PCA and ICA require at least n sensors to separate n sources. In acoustics, several researchers (10-14, 37) have developed techniques for single-channel source separation in which they attempted to separate the signals of two human speakers recorded by a single microphone. In almost all the cases, they used short-time Fourier transform (STFT) and non-negative matrix factorization (NMF) techniques.

The  NMF  technique  was  first  introduced  by  Lee  and  Seung  (15,  16)  and  was  adopted  by  others 
to  minimize  the  cost  function 


(1) 


where $X$ is the STFT with variables in frequency $\omega$ and time $t$; $H$ and $W$ are the basis and weight matrices; and $\lambda$ controls the sparsity of the weights, that is, fewer weights, hence fewer basis functions, will be used. The fact that the elements of $X_{\omega t}$, $H$, and $W$ are all non-negative gives the algorithm the name non-negative matrix factorization. We use the discrete cosine transform (DCT) instead of the STFT to avoid the problems arising due to complex signals. Let
$$X_i = \mathrm{dct}(x_i(t)) \qquad (2)$$

be the DCT of the signal $x_i(t)$. It is found that the first 500 DCT coefficients are sufficient to reconstruct the time domain signal with negligible distortion. It is worth noting that earlier versions of JPEG compression schemes used DCT. So $X_i$ denotes the first 500 DCT coefficients. Let $B_i$ and $\beta_i$ denote the positive and negative DCT coefficients such that $X_i = B_i - \beta_i$. Let the matrix $X_p = \{X_i\}; \forall i$ be the set of DCT coefficients for all the training data corresponding to the people. Then, the matrix $[X_p]$ can be written as

$$[X_p] = [X_p^{+}] - [X_p^{-}] \qquad (3)$$

with matrix $[X_p^{+}] = \{B_i\}$ representing the positive DCT coefficients and matrix $[X_p^{-}] = \{\beta_i\}$ representing the negative DCT coefficients of $X_p$. Similarly, $X_a$ represents the set for animals. After performing the NMF on the matrices, we get

$$[X^{+}] \approx W H \qquad (4)$$

$$[X^{-}] \approx \tilde{W} \tilde{H} \qquad (5)$$

The matrices $W$ and $\tilde{W}$ represent the weight matrices, and the matrices $H$ and $\tilde{H}$ correspond to the bases. Once the basis matrices are available, they can be used to represent the DCT coefficients $X_t$ of a test signal $x(t)$ as a weighted sum of their components. The algorithm to estimate the weights and bases (subset of $H$ and $\tilde{H}$) is given below:

Algorithm 1:

•  Step 1: Normalize the test signal $x(t)$ after removing the mean. Compute $X_t = \mathrm{dct}(x(t))$. Let $X_t = B_t - \beta_t$, where $B_t$ and $\beta_t$ denote the positive and negative DCT coefficients.

•  Step 2: Estimate the weights $W = \{w_1, w_2, \ldots, w_r\}$ and $V = \{v_1, v_2, \ldots, v_r\}$ such that

$$\|B_t - WH\|^2 + \|\beta_t - V\tilde{H}\|^2; \quad \text{such that } 0 \le w_i, v_i \le ub; \ \forall i \in \{1, 2, \ldots, r\} \qquad (6)$$

is minimum; $ub$ is typically 1. One may use any constrained nonlinear optimization program, such as the “fmincon” function in MATLAB, to determine the weights.

•  Step 3: Non-zero weights $w$ and $v$ give the bases used to represent $X_t$.

•  Step 4: Reconstruct the signal $x(t)$ by taking the inverse DCT of the difference $(WH - V\tilde{H})$.
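The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in this work: the basis matrices `H` and `H_tilde` are taken as given (rows are bases), peak normalization is assumed for Step 1, and a simple projected-gradient loop stands in for a general constrained optimizer such as fmincon. Because the two terms of equation 6 share no variables, the weights W and V are estimated separately. All function names are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C: C @ x is the DCT of x, and C.T inverts it."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (m + 0.5) * k / n)
    c[0, :] /= np.sqrt(2.0)
    return c

def split_pos_neg(coeffs):
    """Return (B, beta), both non-negative, with coeffs = B - beta."""
    return np.maximum(coeffs, 0.0), np.maximum(-coeffs, 0.0)

def bounded_weights(target, bases, ub=1.0, iters=2000):
    """Minimize ||target - w @ bases||^2 subject to 0 <= w <= ub."""
    w = np.zeros(bases.shape[0])
    step = 1.0 / (np.linalg.norm(bases, 2) ** 2 + 1e-12)
    for _ in range(iters):
        grad = (w @ bases - target) @ bases.T      # gradient up to a factor of 2
        w = np.clip(w - step * grad, 0.0, ub)      # project onto the box [0, ub]
    return w

def separate(x, H, H_tilde, n_coeffs, ub=1.0):
    """Return weights W, V and the reconstructed time signal."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                               # Step 1: remove the mean
    x = x / (np.max(np.abs(x)) + 1e-12)            # Step 1: normalize (peak, assumed)
    C = dct_matrix(len(x))
    X_t = (C @ x)[:n_coeffs]                       # keep the first n_coeffs coefficients
    B_t, beta_t = split_pos_neg(X_t)
    W = bounded_weights(B_t, H, ub)                # Step 2: weights for positive part
    V = bounded_weights(beta_t, H_tilde, ub)       # Step 2: weights for negative part
    X_hat = np.zeros(len(x))
    X_hat[:n_coeffs] = W @ H - V @ H_tilde         # Step 4: difference of the two parts
    return W, V, C.T @ X_hat                       # Step 4: inverse DCT
```

The non-zero entries of the returned `W` and `V` identify the bases in use (Step 3). With 500 DCT coefficients, `n_coeffs` would be 500 and each row of `H` and `H_tilde` would have 500 entries.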

We used the NMF technique (18) to separate human and animal signatures from single-channel seismic data. In order to verify the technique, we took two seismic signals, one from a person and another from a horse walking, and mixed them, as shown in figure 11. Then the NMF technique was used on the mixture to separate the signals. The results of the NMF technique are shown in figures 12 and 13. From these figures, we notice that the reconstruction (separation) of the signals is good except in the places where there is noise. The NMF algorithm was further used on several sets of data collected on a single person, a single horse, and both a horse and a person walking.

The sums of the weights corresponding to the bases of a horse, $S_h$, and a person, $S_p$, determine whether the extracted signature belongs to a horse (animal) or a person, depending on whether $S_h > S_p$. The values of $S_h$ and $S_p$ are plotted in figure 14 as 'o' and '*', respectively. From the figure, we find that, for the data with one person walking, the NMF algorithm provided $S_p > S_h$, with $S_p$ above the threshold shown by a solid line at 0.7, the majority of the time, and $S_h > S_p$ for the case when a horse was walking. When both a person and a horse walked, both $S_h$ and $S_p$ are above the threshold, indicating that both targets are present. So we can detect and classify the footstep signatures of people and animals even when both are present at the same time. Traditional classification algorithms classify only one of the targets present but not both simultaneously; they fail if both targets are present. Several results using NMF techniques are presented in the literature (39, 40, 43, 44).
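One possible reading of this decision rule is sketched below in Python. The function name, the boolean flags marking which bases belong to the horse, and the returned dictionary layout are assumptions made for illustration; only the comparison of the two sums and the 0.7 threshold come from the text.

```python
def classify_by_weights(weights, is_horse_basis, threshold=0.7):
    """Sum the weights per class and compare them with each other and a threshold.

    weights        -- estimated weights, one per basis
    is_horse_basis -- True where the basis was learned from horse data,
                      False where it was learned from person data (assumed layout)
    """
    s_h = sum(w for w, horse in zip(weights, is_horse_basis) if horse)
    s_p = sum(w for w, horse in zip(weights, is_horse_basis) if not horse)
    return {
        "S_h": s_h,
        "S_p": s_p,
        "horse_present": s_h > threshold,     # above the solid threshold line
        "person_present": s_p > threshold,
        "dominant": "horse" if s_h > s_p else "person",
    }
```

When both `horse_present` and `person_present` are true, the window is interpreted as containing both targets, which corresponds to the mixed case in figure 14c.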


Figure 11. Mixture of a signature from a person and a horse.

Figure 12. Extraction of human signature.

Figure 13. Extraction of horse signature. (Panel title: Reconstructed Horse Signature; annotation: impact of animal footfall.)

Figure 14. Results of NMF algorithm on signature data of (a) man, (b) horse, and (c) man and horse walking. (Panel titles: Man Walking; Man & Horse Walking; horizontal axis: Time (sec).)
2.2 Ultrasonic Sensor Modeling and Ultrasonic Signal Processing

In  this  section,  we  present  the  analysis  of  ultrasonic  sensor  data  to  characterize  and  discriminate 
both  people  and  animals.  The  ultrasonic  sensor  is  an  active  sensor  that  radiates  a  40-kHz 
ultrasonic signal and captures the signals that are bounced back by the target in its beam. The
principle of operation is the same as that of radar (19). The micro-Doppler returns due to the swinging of
the  arms,  legs,  and  torso  of  a  person  or  an  animal  are  analyzed.  We  take  advantage  of  these 
Doppler  returns  from  the  limbs  to  classify  the  targets.  However,  in  order  to  understand  the  type 
of  Doppler  signatures  that  would  be  generated  by  the  swinging  of  arms,  legs,  and  torso  of  various 
targets,  it  is  necessary  to  model  these  parts  and  compute  the  Doppler  values. 


Our  University  of  Mississippi  partners  Bradley  and  Sabatier  (20,  21)  have  explained  the 
observed  human-gait  features  in  Doppler  sonar  grams  by  using  the  Boulic-Thalmann  (BT)  (22) 
model,  shown  in  figure  15,  to  predict  joint  angle  time  histories  and  the  temporal  displacements  of 
the  body’s  center  of  mass.  In  the  BT  model,  body  segments  are  represented  as  ellipsoids. 
Temporally  dependent  velocities  at  the  proximal  and  distal  end  of  key  body  segments  are 
determined  from  the  BT  model,  as  shown  in  figure  16.  Doppler  sonar  grams  are  computed  by 
mapping  velocity-time  dependent  spectral  acoustic  cross  sections  for  the  body  segments  onto 
time-velocity space, mimicking the STFT used in Doppler sonar processing. Figure 17a shows the estimated velocities using the model for various parts of the body for a 6-ft-tall person. The Doppler is related to the radial velocity $v_r$ of the target and is given by $f_d = \frac{2 v_r}{c} f_c$, where $c$ is the propagation velocity of sound and $f_c$ is the radiated carrier frequency. The detailed computation of velocities of various limbs can be found in reference 20. Similar models have been developed (21) for a quadruped, such as a horse. Figure 17b shows the estimated velocities for a horse. The models are validated using experimental results and have been reported previously (41, 42, 45, 47).
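As a worked illustration of the Doppler relation $f_d = \frac{2 v_r}{c} f_c$, the short Python sketch below converts between radial velocity and Doppler shift for the 40-kHz carrier. The speed of sound of 343 m/s (air at roughly 20 °C) and the function names are assumptions made for this example.

```python
def doppler_shift_hz(radial_velocity_mps, carrier_hz=40_000.0, sound_speed_mps=343.0):
    """Doppler shift f_d = (2 * v_r / c) * f_c for a target moving at v_r."""
    return 2.0 * radial_velocity_mps / sound_speed_mps * carrier_hz

def radial_velocity_mps(doppler_hz, carrier_hz=40_000.0, sound_speed_mps=343.0):
    """Inverse relation: v_r = f_d * c / (2 * f_c)."""
    return doppler_hz * sound_speed_mps / (2.0 * carrier_hz)
```

Under these assumptions, a torso moving toward the sensor at 1.5 m/s gives a shift of about 350 Hz, and the inverse function maps a measured shift back to a radial velocity.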


Figure 15. (a) Human model as a stick figure and (b) horse model. (Panel titles: Human Segment Model; Segmented link model - horse.)

Figure 16. Estimation of velocities of various parts of body.


Figure  17.  Estimated  velocity  (Doppler)  for  (a)  a  person  walking  and  (b)  a  horse  walking. 

Figure 18 shows the Doppler signature collected for a person walking in one of the scenarios. Each sensor records the data when the person walks by the sensor. The data collection is done when a person (or a horse) walks toward the sensor at close range (figure 18b) and when the person (or a horse) walks away from the sensor at a distance and to one side of the sensor (figure 18c). In the first case, the signal strength is high and the Doppler returns from the various parts are clearly visible, while in the latter case the Doppler returns are weak and the features are not clearly visible. Hence, we developed two algorithms to classify the targets, namely, (1) when the signal-to-noise ratio (SNR) is high (>6 dB) and (2) when the SNR is low.
Figure 18. (a) Doppler returns for a person walking, (b) measured Doppler when a person walks towards the sensor, and (c) Doppler returns when a person walks away from the sensor.

2.2.1  Case  1:  High  Signal-To-Noise  Ratio  (SNR) 

In this case, the targets walk in front of the sensor in close proximity, and the atmospheric (wind) effects on the received signal are minimal. This is the case where some model features can be clearly identified, and the classification can be made based on the model. An example of high SNR is shown in figure 18b. In order to characterize the target either as a person or an animal, we look for (a) cadence, (b) maximum and minimum variation in the Doppler frequency due to limbs, and (c) the sequential nature of limb movements. Figure 19 shows the enlarged version of figure 18b. It shows the average Doppler of the torso (average velocity of a person walking) and the Doppler due to arms and legs. When the arms and legs swing forward/backward, we get a Doppler above/below the average (the sinusoidal line above/below the average line). The cadence is estimated as $1/t$, where $t$ is the time between two peaks of a sinusoidal curve. The cadence is estimated to be 1.8 Hz for the person. The maximum and minimum Doppler frequency of limbs with respect to the average is found to be ±800 Hz, and this will be contrasted with the values for an animal. The vertical lines on figure 19 are drawn to show the lag (sequential nature of limb movement) between the top and lower sinusoidal curves. The lag is ~0.1 s.
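The cadence and lag measurements described above can be sketched in Python as follows, assuming the peak times of the upper and lower sinusoidal curves have already been picked from the Doppler output. The function names, the use of the median, and the convention that the lower curve peaks after the upper curve are assumptions made for illustration.

```python
import numpy as np

def cadence_from_peaks(peak_times_s):
    """Cadence in Hz as 1/t, with t the median time between successive peaks."""
    t = np.diff(np.sort(np.asarray(peak_times_s, dtype=float)))
    return 1.0 / float(np.median(t))

def limb_lag(upper_peak_times_s, lower_peak_times_s):
    """Median delay from each upper-curve peak to the next lower-curve peak."""
    upper = np.sort(np.asarray(upper_peak_times_s, dtype=float))
    lower = np.sort(np.asarray(lower_peak_times_s, dtype=float))
    idx = np.searchsorted(lower, upper)     # first lower peak at or after each upper peak
    valid = idx < len(lower)                # drop upper peaks with no later lower peak
    return float(np.median(lower[idx[valid]] - upper[valid]))
```

Using the median rather than a single pair of peaks makes the estimates less sensitive to an occasional missed or spurious peak.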
Figure 19. Measured Doppler output for a person.

Figure 20 shows the ultrasonic returns from a horse walking. One clear distinction between the Doppler signatures for a person and a horse walking is that the signature for a person is sinusoidal in nature. Horse motion is much more complex, as there are many more moving parts for a horse than for a person. Another distinct feature for the horse is that the Doppler returns below the “average torso Doppler” are significantly less. The maximum variation of Doppler for the horse (~1500 Hz) is higher than that for a person (~1100 Hz). Figure 21 shows the Doppler energy plot for a horse walking. The peaks in figure 21 show the periodic nature of a horse walking; the cadence can be estimated from it. The cadence of the horse is estimated to be around 1.7 Hz. This cadence value is low for a horse because the horse was made to walk slowly on purpose. This is also verified using the seismic data. The cadence of the horse is found to be close to the cadence of a person walking. Hence, cadence alone cannot be used to distinguish a person from a horse or any other quadruped. From figure 21, we notice that each peak has an adjacent smaller peak marked by ellipses in the figure. This double-peaked result is characteristic of a quadruped walking and is also seen in the model shown in figure 17b. The time difference between two adjacent peaks is ~0.12 s. The algorithm to distinguish people and animals using the features described earlier is given in the flowchart shown in figure 22.
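A band-energy curve of the kind plotted in figure 21 could be computed along the following lines. This Python sketch assumes that the input has already been demodulated so that frequencies are Doppler offsets from the carrier; the frame length, hop size, Hann window, and function name are likewise assumptions and not the settings used in this work.

```python
import numpy as np

def band_energy(x, fs, f_lo=500.0, f_hi=1100.0, frame=256, hop=128):
    """Short-time energy of x in the band [f_lo, f_hi] Hz.

    Returns (times, energy): the center time of each frame in seconds and
    the summed squared spectral magnitude inside the band for that frame.
    """
    x = np.asarray(x, dtype=float)
    win = np.hanning(frame)
    freqs = np.fft.rfftfreq(frame, d=1.0 / fs)
    keep = (freqs >= f_lo) & (freqs <= f_hi)
    starts = np.arange(0, len(x) - frame + 1, hop)
    energy = np.array([
        np.sum(np.abs(np.fft.rfft(x[s:s + frame] * win))[keep] ** 2)
        for s in starts
    ])
    times = (starts + frame / 2.0) / fs
    return times, energy
```

Peaks of the returned energy curve can then be located to estimate the cadence and to check for the adjacent smaller peak described above.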
Figure 20. Measured Doppler for a horse.


Figure  21.  Doppler  energy  in  500-1100  Hz  band  for  a  horse  walking. 


Figure 22. Classification of ultrasonic data. (Output label: likelihood of a person.)

2.2.2  Case  2:  Low  Signal-To-Noise  Ratio 

In  the  event  the  signal  returns  are  weak  for  any  reason  (such  as  prevailing  winds,  the  target  being 
farther  than  optimal  distance,  the  target  being  illuminated  by  the  ultrasonic  transducer  at  an 
angle,  etc.),  the  features  observed  in  the  case  of  high  SNR  may  not  be  present,  as  seen  in 
figure  18c.  For  low  SNR  data,  it  is  appropriate  to  use  classical  signal  processing  techniques  to 
classify  the  targets.  We  used  a  support  vector  machine  (SVM)  for  classification,  as  shown  in 
figure  22. 
3. Conclusions


In this report, we presented the data collection done at the southwest border of the United States during March 2012. We also presented the advances made in signal processing for personnel detection. The main focus has been on acoustic, seismic, and ultrasonic data processing.

Acoustic  data  are  processed  to  determine  the  presence  of  a  human  voice  and  the  sounds  due  to 
footsteps.  If  the  human  voice  is  detected  reliably,  the  cadence  of  the  footsteps  is  ~1.5  Hz,  and  the 
energy  distribution  in  the  spectral  bands  determines  it  belongs  to  a  person,  the  sensor  fusion 
algorithm  classifies  the  target  as  being  a  person  with  high  confidence. 

The seismic sensor data are analyzed for cadence, and the distribution of its harmonics is analyzed to determine whether they belong to a person or an animal. Furthermore, the seismic signatures are classified using NMF to confirm whether they belong to a person or an animal.

We can even determine whether both people and animals are present at the same time, since we separate their signals. A fusion algorithm combines both the results to assign an overall classification rating. Integration of the results over a period of time results in higher confidence in the classification.

The  ultrasonic  sensor  provides  high  fidelity  Doppler  data  of  arm,  leg,  and  torso  movements  of 
people  and  animals.  These  Doppler  returns  are  processed  with  the  algorithms  developed  using 
physics-based phenomenological models to determine the classification of the targets. When the
SNR is >6 dB, the classification is very accurate (within the 95% range).




4.  References 


1. Houston, K. M.; McGaffigan, D. P. Spectrum Analysis Techniques for Personnel Detection Using Seismic Sensors. Proc. of SPIE 2003, 5090, 162-173.

2. Sabatier, James M.; Ekimov, Alexander. Range Limitation for Seismic Footstep Detection. Proc. of SPIE 2008, 6963, 69630V-1.

3. Ekimov, Alexander; Sabatier, James M. Passive Ultrasonic Method for Human Footstep Detection. Proc. of SPIE 2007, 6562, 656203.

4. Sunderesan, A.; Subramanian, A.; Varshney, P. K.; Damarla, T. A Copula Based Semi-Parametric Approach for Footstep Detection Using Seismic Sensor Networks. Proc. of SPIE 2010, 7710, 77100C.

5. Bland, R. E. Acoustic and Seismic Signal Processing for Footstep Detection; Master’s Thesis, Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2006.

6.  Iyengar,  S.  G.;  Varshney,  P.  K.;  Damarla,  T.  On  the  Detection  of  Footsteps  Based  on 
Acoustic  and  Seismic  Sensing,  in  Conference  Record  of  the  Forty  First  Asilomar  Conference 
on  Signals,  Systems  and  Computers,  ACSSC  2007,  pp.  2248-2252,  4-7  November  2007. 

7. Damarla, T.; Ufford, David. Personnel Detection Using Ground Sensors. Proc. of SPIE 2007, 6562, 656205.

8.  Encyclopedia  of  Acoustics;  Vol.  4,  Edited  by  Malcolm  J.  Crocker,  Published  by  John  Wiley 
&  Sons,  Inc.  New  York,  NY,  1997. 

9.  Hyvarinen,  A.;  Karhunen,  J.;  Oja,  E.  Independent  Component  Analysis;  John  Wiley  &  Sons, 
New  York,  NY  10158,  2001. 

10.  Schmidt,  M.  N.;  Morup,  M.  Nonnegative  Matrix  Factor  2-d  Deconvolution  for  Blind  Single 
Channel  Source  Separation,  in  Independent  Component  Analysis  2006,  700-707. 

11. Schmidt, M. N.; Olsson, R. K. Single-Channel Speech Separation Using Sparse Non-Negative Matrix Factorization, in International Conference on Spoken Language Processing (INTERSPEECH), 2006.

12.  Roweis,  S.  T.  One  Microphone  Source  Separation.  In  Advances  in  Neural  Information 
Processing  Systems  2000,  13,  793-799  (MIT  Press). 




13. Jang, G.; Lee, T.-W.; Cardoso, J.-F.; Oja, E.; Amari, S. I. A Maximum Likelihood Approach to Single-Channel Source Separation. Journal of Machine Learning Research 2003, 4, 1365-1392.

14. King, B.; Atlas, L. Single-Channel Source Separation Using Simplified-Training Complex Matrix Factorization, in Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on, 4206-4209, 2010.

15.  Lee,  D.  D.;  Seung,  H.  Learning  the  Parts  of  Objects  by  Non-Negative  Matrix  Factorization, 
in  Nature  1999,  401,  788-791. 

16.  Lee,  D.  D.;  Seung,  H.  Algorithm  for  Non-Negative  Matrix  Factorization,  in  NIPS  2001, 13, 
556-562. 

17.  King,  B.;  Atlas,  L.  Single-Channel  Source  Separation  Using  Simplified-Training  Complex 
Matrix  Factorization,  in  Acoustics  Speech  and  Signal  Processing  (ICASSP),  2010  IEEE 
International  Conference  on,  4206-4209,  2010. 

18.  Mehmood,  A.;  Damarla,  T.;  Sabatier,  J.  Discrimination  of  People  From  Animals  Using  Non- 
Negative  Matrix  Factorization  in  Seismic  Signatures,  under  2nd  review  Journal  of  Acoustical 
Society  of  America. 

19. Geisheimer, J. L.; Marshall, W. S.; Greneker, E. A Continuous-Wave (cw) Radar for Gait Analysis, in Proc. of 35th Asilomar Conference on Signals, Systems and Computers, 2001, pp. 834-838.

20.  Bradley,  M.;  Sabatier,  J.  M.  Applications  of  Fresnel-Kirchhoff  Diffraction  Theory  in  the 
Analysis  of  Human-Motion  Doppler  Sonar  Grams.  Journal  Acoustical  Society  of  America 
Express  Letters  Nov  2010, 128,  1-10. 

21.  Bradley,  M.;  Sabatier,  J.  M.  Using  Acoustic  Micro-Doppler  Sonar  to  Distinguish  Between 
Human  and  Equine  Motion,  in  NATO  Workshop  on  Autonomous  Sensing  and  Multi-Sensor 
Integration  SET-176  RSM,  Cardiff,  UK,  October  2011. 

22.  Boulic,  T.  R.;  Thalmann,  D.  A  Global  Human  Walking  Model  with  Real-Time  Kinematic 
Personification.  The  Visual  Computer  1990,  6,  344-358. 

23.  Mehmood,  A.;  Sabatier,  J.  M.;  Bradley,  M.;  Ekimov,  A.  Extraction  of  Walking  Human’s 
Body  Segments  Using  Ultrasonic  Doppler.  Acoustical  Society  of  America  Journal  2010, 128, 
316. 

24.  Damarla,  T.;  Sabatier,  J.  Sensor  Fusion  for  Personnel  Detection.  Proc.  NATO  SET-176 
Specialists  Meeting  on  Autonomous  Sensing  and  Multi-Sensor  Integration  for  ISR 
Applications,  Oct.  24-25,  2011,  Cardiff,  UK. 


25. Mitchell, H. B. An Introduction to Multi-Sensor Data Fusion; Springer-Verlag, New York,
2007.

26. Hall, D. L.; McMullen, S. A. H. Mathematical Techniques in Multisensor Data Fusion; Artech
House, Norwood, MA, 2004.

27. Damarla, T.; Mehmood, A.; Sabatier, J. Detection of People and Animals Using
Non-Imaging Sensors, in 14th Intl. Conference on Information Fusion, Chicago, IL, pp. 429-436,
July 5-8, 2011.

28. Sen, P.; Damarla, T.; Hasegawa-Johnson, M. Multi-Sensory Features for Personnel Detection
at Border Crossings, in 14th Intl. Conference on Information Fusion, Chicago, IL, pp. 421-428,
July 5-8, 2011.

29. Rabiner, L. R.; Juang, B.-H.; Levinson, S. E.; Sondhi, M. M. Recognition of Isolated Digits
Using Hidden Markov Models with Continuous Mixture Densities. AT&T Technical Journal
1985, 64 (6 pt 1), 1211-1234.

30. Jin, X.; Gupta, S.; Ray, A.; Damarla, T. Multimodal Sensor Fusion for Personnel Detection,
in Proc. 14th International Conference on Information Fusion, Chicago, IL, July 5-8, 2011,
pp. 437-444.

31. Iyengar, Satish G.; Varshney, Pramod K.; Damarla, Thyagaraju. A Parametric Copula-based
Framework for Hypothesis Testing using Heterogeneous Data. IEEE Trans. on Signal
Processing May 2011, 59 (5), 2308-2319.

32.  Jin,  Xin;  Sarkar,  S.;  Ray,  A.;  Gupta,  S.;  Damarla,  T.  Target  Detection  and  Classification 
Using  Seismic  and  PIR  Sensors.  IEEE  Sensors  Journal  2011. 

33.  Ray,  A.  Symbolic  Dynamic  Analysis  of  Complex  Systems  for  Anomaly  Detection.  Signal 
Processing  2004,  84  (7),  1115-1130. 

34. Gupta, S.; Ray, A. Symbolic Dynamic Filtering for Data-Driven Pattern Recognition, in
Pattern Recognition: Theory and Application. Nova Science Publishers, Hauppauge, NY,
2007, ch. 2, pp. 17-71.

35.  Vidal,  E.;  Thollard,  F.;  Higuera,  C.;  Casacuberta,  F.;  Carrasco,  R.  C.  Probabilistic  Finite- 
State  Machines-Part  I.  IEEE  Transactions  on  Pattern  Analysis  and  Machine  Intelligence 
July  2005,  27  (7),  1013-1025. 

36. Damarla, T.; Walker, T.; Sartain, R. Data Collection and Analysis for Personnel Detection at
a Border Crossing; ARL-TR-5426; U.S. Army Research Laboratory: Adelphi, MD, 2011.

37. Mehmood, A.; Damarla, T. Kernel Non-Negative Matrix Factorization for Seismic Signature
Separation. Journal of Pattern Recognition Research June 2013, 5 (1), 13-25.


38.  Damarla,  T.;  Bradley,  M.;  Mehmood,  A.;  Sabatier,  J.  M.  Classification  of  Animals  and 
People  Ultrasonic  Signatures.  IEEE  Sensors  Journal  May  2013,  13  (5),  1464-1472. 

39. Damarla, T.; Mehmood, A.; Sabatier, J. M. Seismic Signature Analysis for Discrimination of
People From Animals. Proc. of SPIE Apr. 2013, 8745, 874518-1.

40.  Mehmood,  A.;  Damarla,  T.;  Sabatier,  J.  M.  Separation  of  Human  and  Animal  Seismic 
Signatures  Using  Non-Negative  Matrix  Factorization.  Pattern  Recognition  Letters  December 
2012,  33  (16),  2085-2093. 

41.  Mehmood,  A.;  Sabatier,  J.  M.;  Damarla,  T.  Ultrasonic  Doppler  Methods  to  Extract 
Signatures  of  a  Walking  Human.  J.  Acoust.  Soc.  Am.  September  2012, 132  (3),  pp.  EL243- 
EL249. 

42.  Mehmood,  A.;  Sabatier,  J.  M.  Extraction  of  the  Velocity  of  Walking  Human’s  Body 
Segments  Using  Ultrasonic  Doppler.  The  Journal  of  the  Acoustical  Society  of  America  Nov. 
2010, 128  (5),  EL316-EL322. 

43.  Mehmood,  A.;  Patel,  V.  M.;  Damarla,  T.  Discrimination  of  Bipeds  from  Quadrupeds  using 
Seismic  Footstep  Signatures.  IEEE  IGARSS  22-27  July  2012,  6920-6923. 

44.  Mehmood,  A.;  Damarla,  T.  Blind  Separation  of  Human  and  Horse  Footstep  Signatures  Using 
Independent  Component  Analysis.  Proc.  SPIE  2012,  8382,  83820L. 

45. Mehmood, A.; Sabatier, J. M.; Damarla, T. Review of Ultrasonic Doppler Measurements.
NATO Workshop on Autonomous Sensing and Multi-Sensor Integration SET-176 RSM,
October 2011.

46.  Damarla,  T.;  Mehmood,  A.;  Sabatier,  J.  M.  Detection  of  People  and  Animals  using  Non- 
Imaging  Sensors.  14th  International  IEEE  Conf.  on  Information  Fusion,  pp.  429-436,  2011. 

47.  Damarla,  T.;  Mehmood,  A.;  Sabatier,  J.  M.  Personnel  Detection  using  Acoustic,  Seismic  & 
Ultrasonic  Signatures.  8th  NATO  Symposium  on  Military  Sensing  SET-169  RSY-025  MSS, 
2011. 

48.  Mehmood,  A.;  Sabatier,  J.  M.  Video  Validation  of  Human  Leg  Velocities  from  Doppler 
Ultrasound.  Proc.  Military  Sensing  Symposium  on  Battlefield  Acoustic  and  Magnetic 
Sensors,  August  2010. 


NO. OF
COPIES    ORGANIZATION

1         DEFENSE TECHNICAL
(PDF)     INFORMATION CTR
          DTIC OCA

2         DIRECTOR
(PDFS)    US ARMY RESEARCH LAB
          RDRL CIO LL
          IMAL HRA MAIL & RECORDS MGMT

1         GOVT PRINTG OFC
(PDF)     A MALHOTRA

7         DIRECTOR
(PDFS)    US ARMY RESEARCH LAB
          RDRL SES A
          THYAGARAJU DAMARLA
          RONALD A. FRANKEL
          ASIF MEHMOOD
          JAMES M. SABATIER
          GARY CHATTERS
          ATTN RDRL SES P
          HAO VU
          ATTN RDRL SES E
          MATTHEW THIELKE




